A Rate-Compatible Sphere-Packing Analysis of 
Feedback Coding with Limited Retransmissions 



Adam R. Williamson, Tsung-Yi Chen and Richard D. Wesel 
Department of Electrical Engineering 
University of California, Los Angeles 
Los Angeles, California 90095 
Email: adamroyce@ucla.edu; tychen@ee.ucla.edu; wesel@ee.ucla.edu 






Abstract — Recent work by Polyanskiy et al. and Chen et al. 
has excited new interest in using feedback to approach capacity 
with low latency. Polyanskiy showed that feedback identifying 
the first symbol at which decoding is successful allows capacity 
to be approached with surprisingly low latency. This paper uses 
Chen's rate-compatible sphere-packing (RCSP) analysis to study 
what happens when symbols must be transmitted in packets, as 
with a traditional hybrid ARQ system, and limited to relatively 
few (six or fewer) incremental transmissions. 

Numerical optimizations find the series of progressively grow- 
ing cumulative block lengths that enable RCSP to approach 
capacity with the minimum possible latency. RCSP analysis shows 
that five incremental transmissions are sufficient to achieve 92% 
of capacity with an average block length of fewer than 101 
symbols on the AWGN channel with SNR of 2.0 dB. 

The RCSP analysis provides a decoding error trajectory that 
specifies the decoding error rate for each cumulative block 
length. Though RCSP is an idealization, an example tail-biting 
convolutional code matches the RCSP decoding error trajectory 
and achieves 91% of capacity with an average block length of 
102 symbols on the AWGN channel with SNR of 2.0 dB. We also 
show how RCSP analysis can be used in cases where packets have 
deadlines associated with them (leading to an outage probability). 



I. Introduction 

Though Shannon showed in 1956 [1] that noiseless feedback
does not increase the capacity of memoryless channels,
feedback's other benefits have made it a staple in modern
communication systems. Feedback can simplify the encoding and
decoding operations and has been incorporated into incremental
redundancy (IR) schemes proposed as early as 1974 [2].
Hagenauer's work on rate-compatible punctured convolutional
(RCPC) codes allows the same encoder to be used in various
channel conditions and uses feedback to determine when to
send additional coded bits [3]. The combination of IR and
hybrid ARQ (HARQ) continues to receive attention in the
literature [4], [5] and in industry standards such as 3GPP.

Although it cannot increase capacity in point-to-point channels,
the information-theoretic benefit of feedback for reducing
latency through a significant improvement in the error
exponent has been well understood for some time. (See, for
example, [8]-[9].) Recent work [10]-[13] casts the latency
benefit of feedback in terms of block length rather than error
exponent, generating new interest in the practical value of
feedback for approaching capacity with a short average block
length.



Polyanskiy et al. provided achievability bounds for the
maximum rate that can be accomplished with feedback for a
finite block length [10], [11] and also demonstrated the
energy-efficiency gains made possible by feedback [14]. Polyanskiy
uses an elegant, single, "stop feedback" symbol (that can occur 
after any transmitted symbol) that facilitates the application 
of Martingale theory to capture the essence of how feedback 
can allow a variable-length code to approach capacity. A 
compelling example from Polyanskiy et al. shows that for a 
binary symmetric channel with capacity 1/2, the average block 
length required to achieve 90% of the capacity is smaller than 
200 symbols. 

For practical systems such as hybrid ARQ, the "stop feedback"
symbol may only be feasible at certain symbol times
because these systems group symbols together for transmission
in packets, so that the entire packet is either transmitted or
not. In [12], [13], Chen et al. used a code-independent
rate-compatible sphere-packing (RCSP) analysis to quantify the
latency benefits of feedback in the context of such grouped
transmissions. Chen et al. focused on the AWGN channel and
also showed that capacity can be approached with surprisingly
small block lengths, similar to the results of [10], [11].
However, this initial work of Chen et al. is limited to a
demonstration example that required several approximations.

Using the RCSP approach of Chen et al. as its foundation, 
this paper introduces an optimization technique and uses it to 
explore how closely one may approach capacity with only a 
handful of incremental transmissions. For a fixed number of 
information bits k and a fixed maximum number of transmissions
m, a numerical optimization algorithm is introduced that
determines the block lengths of each incremental transmission
to maximize the expected throughput. We consider only $m \le 6$
and show that this is sufficient to achieve more than 90%
of capacity while requiring surprisingly small block lengths 
similar to those achieved by Polyanskiy et al. and Chen et al. 

While RCSP is an idealized scheme, it provides meaningful 
guidance for the selection of block lengths and the sequence of 
target decoding error rates, which we call the decoding error 
trajectory. A 1024-state rate-compatible punctured tail-biting
convolutional code using the block lengths determined by our 
RCSP optimization technique achieves the RCSP decoding 
error trajectory and essentially matches the throughput and 
latency performance of RCSP for $m = 5$ transmissions. Our



results, like those of Polyanskiy et al. and Chen et al., assume 
that the receiver is able to recognize when it has successfully 
decoded. The additional overhead of, for example, a cyclic 
redundancy check (CRC) byte has not been included in the 
analysis. Somewhat longer block lengths would be required to 
overcome this overhead penalty. 

The paper is organized as follows: Section II reviews the
RCSP analysis. Section III describes the RCSP numerical
optimization algorithm used to determine transmission lengths
and shows the throughput vs. latency performance achieved by
using these transmission lengths for up to six rate-compatible
transmissions. Section IV introduces the decoding error
trajectory and shows how RCSP performance can be matched
by a real convolutional code using the transmission lengths
identified in the previous section. Section V shows how
the RCSP analysis can be applied to scenarios that involve
strict latency and outage probability constraints. Section VI
concludes the paper.

II. Rate-Compatible Sphere-Packing (RCSP) 

A. Review of Sphere-Packing 

Let us briefly review the sphere-packing analysis presented
in [12], [13] for a memoryless AWGN channel. Consider a
codebook of size $2^k$ that maps $k = N R_c$ information symbols
into a length-N codeword with rate $R_c$. The channel input and
output can be written as:

$$Y = X(j) + Z, \quad j \in \{1, 2, \ldots, 2^k\}, \tag{1}$$



where Y is the output (received word), $X(j)$ is the codeword
of the jth message, and Z is an N-dimensional i.i.d. Gaussian
vector. Let the received SNR be $\eta$ and assume without loss
of generality that each noise sample has unit variance. The
average power of received word Y is then $N(1 + \eta)$. As in
[12], the largest possible squared decoding radius assuming
that spheres occupy all available volume is

$$r^2 = \frac{N(1 + \eta)}{2^{2k/N}}. \tag{2}$$



A bounded-distance decoder declares any received word within a
distance r of codeword $X(j)$ to be message j. Otherwise,
a decoding error is declared. Because the sum of the squares of
the N Gaussian noise samples obeys a chi-square distribution
with N degrees of freedom, the probability $P(\zeta)$ of decoding
error associated with decoding radius r is

$$P(\zeta) = P\left(\sum_{\ell=1}^{N} z_\ell^2 > r^2\right) = 1 - F_{\chi^2_N}(r^2), \tag{3}$$

where the $z_\ell$ are standard normal random variables
(zero mean, unit variance) and $F_{\chi^2_N}(x)$ is the CDF of
a chi-square distribution with N degrees of freedom.
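
As a rough numerical illustration of (2), take k = 64 and N = 116 (the k = 64 ARQ entry of Table I) at an SNR of 2.0 dB, i.e., $\eta = 10^{0.2} \approx 1.585$. The rounding below is ours and is only meant to give a feel for the scale of the quantities:

$$r^2 = \frac{116\,(1 + 1.585)}{2^{128/116}} \approx \frac{299.8}{2.149} \approx 139.6.$$

The noise energy $\sum_\ell z_\ell^2$ has mean N = 116 and standard deviation $\sqrt{2N} \approx 15.2$, so in this example a decoding error requires the noise energy to exceed its mean by roughly 1.5 standard deviations.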

B. Sphere-Packing for Rate-Compatible Transmissions 

The idea of RCSP is to assume that sphere-packing performance
can be achieved by each transmission in a sequence of
rate-compatible transmissions. Thus the idealized
sphere-packing analysis is applied to a modified incremental
redundancy with feedback (MIRF) scheme as described in
[12]. MIRF works as follows: k information symbols are
coded with an initial block length $N_1 = I_1$. If the receiver
cannot successfully decode, the transmitter will receive a
NACK and send $I_2$ extra symbols. The decoder attempts
to decode again using all received symbols for the current
codeword, i.e., with block length $N_2 = I_1 + I_2$. The process
continues for $i = 3, \ldots, m$, where m is the maximum number
of transmissions. The decoded block length $N_j$ of the jth
transmission is $N_j = \sum_{i=1}^{j} I_i$. If decoding is not
successful after m transmissions, the decoder discards the m
transmissions and the process begins again with the transmitter
resending the $I_1$ initial symbols. This scheme with $m = 1$ is
standard ARQ.

The squared decoding radius of the jth cumulative transmission
is

$$r_j^2 = \frac{N_j(1 + \eta)}{2^{2k/N_j}}, \tag{4}$$

and the marginal probability of decoding error $P(\zeta_j)$
associated with decoding radius $r_j$ is

$$P(\zeta_j) = P\left(\sum_{\ell=1}^{N_j} z_\ell^2 > r_j^2\right), \tag{5}$$

where the $z_\ell$ are standard normal random variables
with zero mean and unit variance.

However, this marginal probability is not what is needed.
The probability of a decoding error in the jth transmission
depends on previous error events. Indeed, conditioning on
previous decoding errors $\zeta_1, \ldots, \zeta_{j-1}$ makes the
error event $\zeta_j$ more likely than the marginal distribution
would suggest. The joint probability $P(\zeta_1, \ldots, \zeta_j)$ is

$$P(\zeta_1, \ldots, \zeta_j) = P\left(\bigcap_{i=1}^{j} \zeta_i\right). \tag{6}$$



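We compute the expected number of channel uses (i.e.,
latency or average block length) $\lambda$ and the expected number
of transmissions $\tau$ as

$$\lambda = \frac{I_1 + \sum_{i=2}^{m} I_i P\left(\bigcap_{j=1}^{i-1} \zeta_j\right)}{1 - P\left(\bigcap_{j=1}^{m} \zeta_j\right)}, \qquad \tau = \frac{1 + \sum_{i=1}^{m-1} P\left(\bigcap_{j=1}^{i} \zeta_j\right)}{1 - P\left(\bigcap_{j=1}^{m} \zeta_j\right)}. \tag{7}$$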





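The expected throughput $R_t$ is given by

$$R_t = \frac{k\left[1 - P\left(\bigcap_{j=1}^{m} \zeta_j\right)\right]}{I_1 + \sum_{i=2}^{m} I_i P\left(\bigcap_{j=1}^{i-1} \zeta_j\right)}. \tag{8}$$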
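
The bookkeeping in (7) and (8) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `mirf_metrics` and the example probabilities are hypothetical, and the joint error probabilities of (6) are assumed to be supplied by the caller.

```python
from typing import Sequence, Tuple


def mirf_metrics(k: int, lengths: Sequence[int],
                 joint_err: Sequence[float]) -> Tuple[float, float, float]:
    """Return (latency, transmissions, throughput) following (7) and (8).

    lengths[i]   is the increment I_{i+1}.
    joint_err[i] is P(zeta_1, ..., zeta_{i+1}), the probability that the
                 first i+1 decoding attempts all fail (assumed given).
    """
    if len(lengths) != len(joint_err) or not lengths:
        raise ValueError("need one joint error probability per transmission")
    success = 1.0 - joint_err[-1]  # a round ends in a successful decoding
    # Expected symbols sent in one round of up to m transmissions.
    symbols = lengths[0] + sum(l * p for l, p in zip(lengths[1:], joint_err))
    # Expected transmissions in one round.
    attempts = 1.0 + sum(joint_err[:-1])
    latency = symbols / success        # lambda in (7)
    transmissions = attempts / success  # tau in (7)
    throughput = k * success / symbols  # R_t in (8)
    return latency, transmissions, throughput


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the formulas.
    print(mirf_metrics(6, [10, 5], [0.5, 0.1]))
```

With a single transmission the function reduces to the ARQ expressions, since the round length is simply the first increment divided by the success probability.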



TABLE I

Optimized transmission lengths $N_1 = I_1$ and rates $R_c$ for
ideal-sphere-packing ARQ with information lengths k.
SNR = 2.0 dB, Capacity = 0.6851

k        16      32      64      128     256
$I_1$    31      60      116     222     429
$R_c$    0.516   0.533   0.552   0.577   0.597



III. Choosing $I_i$ Values to Maximize Throughput

A. Selecting $I_1$ for the $m = 1$ (ARQ) Special Case

In the special case of $m = 1$ (when only the initial
transmission of length $I_1$ is ever transmitted), MIRF is ARQ.
In this case the expected number of channel uses given by (7)
can be simplified as follows:

$$\lambda_{\text{ARQ}} = \frac{I_1}{1 - P(\zeta_1)} = \frac{I_1}{F_{\chi^2_{I_1}}(r_1^2)}, \tag{9}$$

which yields an expected throughput of

$$R_t^{\text{ARQ}} = \frac{k}{I_1} F_{\chi^2_{I_1}}(r_1^2) = R_c F_{\chi^2_{I_1}}(r_1^2), \tag{10}$$

where $r_1^2 = I_1(1 + \eta)/2^{2k/I_1}$.



If we fix the number of information
bits k, (10) becomes a quasiconcave function of the initial
code rate $R_c = k/I_1$ [16], allowing the optimal code rate
$R_c^{\text{opt}}$, which maximizes the throughput $R_t$ for a given k, to
be found numerically.
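
One way to carry out this numerical search is sketched below in standard-library Python. It is an illustration under stated assumptions rather than the authors' procedure: the chi-square CDF is evaluated with the textbook power series for the regularized lower incomplete gamma function, the helper names are hypothetical, and the search is restricted to integer lengths between k and 4k (an arbitrary window chosen for this sketch).

```python
import math


def chi2_cdf(x: float, dof: int) -> float:
    """Chi-square CDF via the series for the regularized lower incomplete
    gamma function P(dof/2, x/2). All terms are positive, so the sum is
    numerically stable; it is merely slow when x is far above dof."""
    if x <= 0.0:
        return 0.0
    a, y = dof / 2.0, x / 2.0
    term = total = 1.0
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= y / (a + n)
        total += term
    log_prefactor = -y + a * math.log(y) - math.lgamma(a + 1.0)
    return min(1.0, total * math.exp(log_prefactor))


def arq_throughput(k: int, length: int, snr: float) -> float:
    """Expected ARQ throughput: (k / I_1) times the CDF evaluated at r_1^2."""
    r_sq = length * (1.0 + snr) / 2.0 ** (2.0 * k / length)
    return (k / length) * chi2_cdf(r_sq, length)


def best_arq_length(k: int, snr: float) -> int:
    """Integer I_1 in [k, 4k] (assumed search window) maximizing throughput."""
    return max(range(k, 4 * k + 1), key=lambda n: arq_throughput(k, n, snr))


if __name__ == "__main__":
    snr = 10 ** (2.0 / 10.0)  # 2.0 dB
    for k in (16, 32, 64, 128, 256):
        n = best_arq_length(k, snr)
        print(k, n, round(k / n, 3), round(arq_throughput(k, n, snr), 3))
```

Because the objective is quasiconcave in the code rate, a bisection or golden-section search would also work; the exhaustive scan is used here only because the integer search space is tiny.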

Table I presents the optimal initial code rates $R_c$ and
transmission lengths $N_1 = I_1$ for ARQ assuming ideal
sphere-packing computed using (10), with the restriction that
the lengths $I_1$ must be integers. The maximum achievable
throughput in the $m = 1$ (ARQ) RCSP scheme is plotted
as the red (diamond markers) curve in Fig. 1. For $m > 1$,
identifying the transmission lengths $I_i$ which minimize the
latency $\lambda$ in (7) is not straightforward due to the joint decoding
error probabilities in (6). The next section describes a method
for obtaining a good set of transmission lengths $I_i$ for the
$m > 1$ case.

B. Algorithm for Selecting $I_i$ Values

In [12], Chen et al. demonstrated one specific RCSP scheme
with ten transmissions that could approach capacity with low
latency. Specifically, the transmission lengths were fixed to
$I_1 = 64$ and $I_2, \ldots, I_{10} = 10$, while k was varied to
maximize throughput. In contrast, in this paper we build on
the intuition of the ARQ case presented above to fix both k
and the number of transmissions m, in order to search for the
set of transmission lengths $I_i$ that maximizes throughput. We
seek to identify approximately how much throughput can be
achieved using feedback with a small number of incremental
transmissions, specifically $m \le 6$. Furthermore, we seek
insight into what the transmission lengths should be and what
decoding error rates allow the sequence of transmissions to be
most efficient.

The restriction to a small m allows exact computation of
(6) in Mathematica, avoiding the approximations of [12]. To
reflect practical constraints, we restrict the lengths $I_i$ to be
integers. Fig. 2 presents the algorithm for selecting the $I_i$
values.

Fig. 1. Throughput vs. latency for rate-compatible sphere-packing with m
rate-compatible transmissions, $m \in \{1, \ldots, 6\}$, with transmission lengths $I_i$
identified according to the algorithm in Fig. 2.

The computational complexity of (6), which increases with
the transmission index j, forces us to limit attention to a
well-chosen subset of possible transmission lengths. Thus, our
present results may be considered as lower bounds to what is
possible with a fully exhaustive optimization. Fig. 1 shows
the throughput vs. latency performance achieved by RCSP
using the algorithm in Fig. 2 for $m \in \{1, \ldots, 6\}$ on an AWGN
channel with SNR 2.0 dB. As m is increased, each additional
retransmission brings the expected throughput $R_t$ closer to
the channel capacity, though with diminishing returns. The
scenario in [11], in which the variable-length feedback (VLF)
codes studied allow feedback after each individual bit is
transmitted, may be considered the limit of MIRF as $m \to \infty$.



1) Fix k and m.
2) for transmission index $i = 1, \ldots, m$ do
3)   for specified ranges of values for $I_1, \ldots, I_i$ do
4)     Compute the squared decoding radius $r_i^2$ as in (4).
5)     Compute the joint probability of decoding error
       after the ith transmission, $P\left(\bigcap_{j=1}^{i} \zeta_j\right)$, as in (6).
6)     if $i = m$ then
7)       Compute and store the expected latency $\lambda$ and
         number of transmissions $\tau$ as in (7).
8)       Compute and store the expected throughput $R_t$
         as in (8).
9)     end if
10)  end for
11) end for
12) Pick the set of lengths $I_1, \ldots, I_m$ that maximizes $R_t$.

Fig. 2. Algorithm for Selecting Lengths $I_i$
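
The search of Fig. 2 can be sketched in standard-library Python as follows. Two substitutions are made and should be read as assumptions of this sketch: the joint probabilities of (6) are estimated by Monte Carlo simulation rather than computed exactly as in the paper, and the loops are flattened into a single pass over all candidate length tuples. All function names and the candidate ranges in the usage example are hypothetical.

```python
import itertools
import random
from typing import List, Sequence, Tuple


def noise_energy_table(n_max: int, trials: int,
                       rng: random.Random) -> List[List[float]]:
    """Per trial, running sums of squared unit-variance Gaussian samples."""
    table = []
    for _ in range(trials):
        sums, acc = [0.0], 0.0
        for _ in range(n_max):
            acc += rng.gauss(0.0, 1.0) ** 2
            sums.append(acc)  # sums[n] = noise energy in the first n symbols
        table.append(sums)
    return table


def joint_error_probs(k: int, lengths: Sequence[int], snr: float,
                      table: List[List[float]]) -> List[float]:
    """Monte Carlo estimate of P(zeta_1, ..., zeta_j) for j = 1..m."""
    cum = list(itertools.accumulate(lengths))  # N_1, ..., N_m
    radii = [n * (1.0 + snr) / 2.0 ** (2.0 * k / n) for n in cum]  # r_j^2
    failing = [0] * len(cum)
    for sums in table:
        for j, (n, r_sq) in enumerate(zip(cum, radii)):
            if sums[n] <= r_sq:  # decoded here, so later attempts never occur
                break
            failing[j] += 1
    return [count / len(table) for count in failing]


def throughput(k: int, lengths: Sequence[int], joint: Sequence[float]) -> float:
    """Expected throughput from the increments and joint error probabilities."""
    symbols = lengths[0] + sum(l * p for l, p in zip(lengths[1:], joint))
    return k * (1.0 - joint[-1]) / symbols


def search_lengths(k: int, snr: float, ranges: Sequence[Sequence[int]],
                   trials: int = 2000, seed: int = 0) -> Tuple[tuple, float]:
    """Exhaustive search over the given candidate ranges for I_1, ..., I_m."""
    rng = random.Random(seed)
    # The same noise realizations are reused for every candidate.
    table = noise_energy_table(sum(max(r) for r in ranges), trials, rng)
    best_lengths, best_rt = None, -1.0
    for lengths in itertools.product(*ranges):
        rt = throughput(k, lengths, joint_error_probs(k, lengths, snr, table))
        if rt > best_rt:
            best_lengths, best_rt = lengths, rt
    return best_lengths, best_rt


if __name__ == "__main__":
    candidate_ranges = [range(75, 100, 5), range(8, 21, 4), range(8, 21, 4)]
    print(search_lengths(64, 10 ** 0.2, candidate_ranges))
```

Reusing one table of noise realizations for all candidates keeps the comparison between length tuples from being dominated by sampling noise, at the cost of estimates that are only as accurate as the number of trials allows.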



The points on each curve in Fig. 1 represent values of k
ranging from 16 to 256 information bits. Fig. 1 shows, for
example, that by allowing up to four retransmissions (m = 5) 
with k = 64 RCSP can achieve 91% of capacity with an 
average block length of 102 symbols. 

IV. Comparison of RCSP and Convolutional Codes 

RCSP makes the rather optimistic assumption that a family 
of rate-compatible codes can be found that performs, at each
rate, as well as codes that pack decoding spheres so
well that they use all of the available volume. Lattice theory 
would suggest that such codes might be hard to find since the 
packing density of lattices actually decreases as dimension 
increases. However, we will show in this section that a rate- 
compatible tail-biting convolutional code can indeed match the 
performance of RCSP, at least for m = 5. 

A. Two Convolutional Codes 

We consider two rate-1/3 convolutional codes
from [17]: a 64-state code with generator polynomials
$(g_1, g_2, g_3) = (133, 171, 165)$ and a 1024-state code with
$(g_1, g_2, g_3) = (3645, 2133, 3347)$, where the generator notation is
octal. We restrict our attention to tail-biting implementations
of these convolutional codes because the throughput efficiency
advantage is important for the relatively small block lengths
we consider. Simulations study the performance of these two
codes in the MIRF setting for the AWGN channel with SNR
2 dB, as shown in Fig. 1. The simulations presented here
focus on the $k = 64$ case.

The transmission lengths $I_i$ used in the simulations are
exactly those identified by the algorithm presented in Fig. 2.
Table II shows the results of the $m = 5$ optimization (i.e.,
the set of lengths $I_i$ found to achieve the highest throughput).
Thus our simulations used $I_1 = 85$, $I_2 = 12$, $I_3 = 8$, $I_4 = 12$,
$I_5 = 16$. The induced code rates of the cumulative blocks
are 64/85 = 0.753, 64/97 = 0.660, 64/105 = 0.610, 64/117 = 0.547
and 64/133 = 0.481. Note that the initial code rate is above the
channel capacity of 0.685.
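
The arithmetic behind these induced rates is easy to reproduce. The short Python sketch below uses the lengths quoted above; the capacity is computed with the usual real-AWGN formula $\frac{1}{2}\log_2(1 + \eta)$, and the variable names are ours.

```python
import math
from itertools import accumulate

k = 64
increments = [85, 12, 8, 12, 16]          # I_1, ..., I_5 from the text
cumulative = list(accumulate(increments))  # N_1, ..., N_5
rates = [k / n for n in cumulative]        # induced code rates k / N_j

snr = 10 ** (2.0 / 10.0)                   # 2 dB as a linear ratio
capacity = 0.5 * math.log2(1.0 + snr)      # bits per real channel use

for n, rate in zip(cumulative, rates):
    side = "above" if rate > capacity else "below"
    print(f"N = {n:3d}  rate = {rate:.3f}  ({side} capacity)")
```

Only the first cumulative block has a rate above capacity; every later block operates below it.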

Hagenauer pioneered rate-compatible puncturing of
convolutional codes in 1988 [3] and provided an approach for
optimization based on periodic puncturing. While we are 
currently exploring optimized puncturing that matches the 
specific needs of MIRF, the results we present here employ 
pseudorandom (but rate-compatible) puncturing. 

TABLE II

Results of the algorithm of Fig. 2 for $m = 5$ and an AWGN
channel with SNR 2 dB.

Fig. 3. A comparison of the decoding error trajectories of the sphere-packing
analysis and simulated convolutional codes, for up to five transmissions with
optimized block lengths.

B. Decoding Error Trajectory Comparison 

In addition to prescribing the lengths $I_i$, the algorithm in
Fig. 2 also computes the joint decoding error probabilities of
(6), which we call the "decoding error trajectory" prescribed by
RCSP. If we can find a rate-compatible family that can achieve
this decoding error trajectory, then we can match the RCSP
performance. Fig. 3 shows the decoding error trajectory for
$m = 5$ transmissions and $k = 64$ information bits, for
both the sphere-packing analysis and the convolutional code
simulations. While the 64-state code is not powerful enough
to match RCSP performance, the 1024-state code closely
follows the RCSP trajectory. Thus there exist practical codes,
at least in some cases, that achieve the idealized performance
of RCSP. Indeed, Fig. 1 plots the $(\lambda, R_t)$ points of the two
convolutional codes, demonstrating that the 1024-state code
achieves 91% of capacity with an average latency of 102
symbols, almost exactly coinciding with the RCSP point for
$m = 5$ and $k = 64$. The convolutional code's ability to match
a mythical sphere-packing code is due to maximum likelihood
(ML) decoding, which has decoding regions that completely
fill the multidimensional space (even in high dimensions).
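
For readers who want a feel for what an RCSP decoding error trajectory looks like, the following Python sketch estimates one by simulation for the lengths (85, 12, 8, 12, 16) and k = 64. It is not the exact computation used in the paper: the noise energy of each increment is drawn as a chi-square variable (a gamma variable with shape equal to half the increment length and scale 2), and the function name, trial count and seed are arbitrary choices of this sketch.

```python
import itertools
import random
from typing import List, Sequence


def rcsp_trajectory(k: int, lengths: Sequence[int], snr: float,
                    trials: int = 20000, seed: int = 1) -> List[float]:
    """Estimate P(zeta_1, ..., zeta_j), j = 1..m, for bounded-distance
    decoding with the sphere-packing radii of the cumulative blocks."""
    rng = random.Random(seed)
    cum = list(itertools.accumulate(lengths))
    radii = [n * (1.0 + snr) / 2.0 ** (2.0 * k / n) for n in cum]
    failures = [0] * len(lengths)
    for _ in range(trials):
        energy = 0.0
        for j, (inc, r_sq) in enumerate(zip(lengths, radii)):
            # Noise energy of `inc` unit-variance Gaussian samples is
            # chi-square with `inc` degrees of freedom.
            energy += rng.gammavariate(inc / 2.0, 2.0)
            if energy <= r_sq:  # inside the decoding sphere: success
                break
            failures[j] += 1    # attempts 1..j+1 have all failed
    return [f / trials for f in failures]


if __name__ == "__main__":
    trajectory = rcsp_trajectory(64, [85, 12, 8, 12, 16], 10 ** 0.2)
    for j, p in enumerate(trajectory, start=1):
        print(f"after transmission {j}: joint error probability ~ {p:.4f}")
```

The estimated sequence is non-increasing by construction, because a trial counts as a joint failure at step j only if it also failed at every earlier step.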

C. Caveat 

These simulation results assume that the receiver is able 
to recognize when it has successfully decoded. This same 
assumption is made by the RCSP analysis, Polyanskiy et al., 
and Chen et al. While this assumption does not undermine 
the essence of this demonstration of the power of feedback, 
its practical and theoretical implications must be reviewed 
carefully, especially when very short block lengths are consid- 
ered. An important practical implication is that the additional 
overhead of a CRC required to avoid undetected errors will 
drive real systems to somewhat longer block lengths than those 
presented here. This will affect the choice of error control 
code. An important theoretical implication is that this analysis 



cannot be trusted if the block lengths become too small. This 
assumption allows block errors to become block erasures at 
no cost. Consider the binary symmetric channel (BSC). If 
the block length is allowed to shrink to a single bit, then 
this seemingly innocuous assumption turns the zero capacity 
BSC with transition probability 1/2 into a binary erasure 
channel with probability 1/2, which has a capacity of 1/2 
instead of zero. Both the practical and theoretical problems 
of this assumption diminish as block length grows. However, 
a quantitative understanding of the cost of knowing when 
decoding is successful and how that cost changes with block 
length is an important area for future work. 
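
The single-bit example can be checked against the standard capacity formulas. Here $H_b$ denotes the binary entropy function, p the BSC transition probability and $\epsilon$ the erasure probability; these symbols are introduced only for this illustration:

$$C_{\text{BSC}}(p) = 1 - H_b(p), \quad C_{\text{BSC}}(1/2) = 1 - 1 = 0, \qquad C_{\text{BEC}}(\epsilon) = 1 - \epsilon, \quad C_{\text{BEC}}(1/2) = 1/2.$$

Knowing for free which single-bit blocks were decoded incorrectly is what converts the first channel into the second.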

V. RCSP with Latency and Outage Constraints

As presented, MIRF has an outage probability of zero 
because it never stops trying until a message is decoded 
correctly. With slight modifications, we can adapt the MIRF 
scheme and algorithm presented in Fig. 2 to incorporate strict
constraints on latency (so that the transmitter gives up after m 
transmissions) and outage probability (which would then be 
nonzero). 

To handle these two new constraints, we restrict the probability
of decoding error after m transmissions, $P(\zeta_1, \ldots, \zeta_m)$,
to be less than a specified threshold $P_{\text{outage}}$. Without modifying
the computations of the error probabilities $P(\zeta_1, \ldots, \zeta_j)$ in
(6), we update step 12 of the algorithm to the following:

Pick the set of lengths that yields the optimal (maximum)
throughput s.t. $P(\zeta_1, \ldots, \zeta_m) < P_{\text{outage}}$.

Now, when there is a decoding error on the mth transmis- 
sion, the transmitter declares an outage event and proceeds 
to encode the next k information bits. This scheme may 
be suitable for delay-sensitive multimedia communication, in 
which data packets are not useful to the receiver after their 
decoding deadlines have passed. 

The error probabilities $P(\zeta_1, \ldots, \zeta_j)$ are still given by
(6), though the expected number of channel uses $\lambda$ and the
expected number of transmissions $\tau$ are now given by

$$\lambda = I_1 + \sum_{i=2}^{m} I_i P\left(\bigcap_{j=1}^{i-1} \zeta_j\right), \qquad \tau = 1 + \sum_{i=1}^{m-1} P\left(\bigcap_{j=1}^{i} \zeta_j\right). \tag{11}$$

The expected throughput is again given by (8). We update
the algorithm in Section III-B to compute the latency $\lambda$ using
(11) instead of (7). The main difference now is that there is a
nonzero probability of outage.
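
A minimal Python sketch of the outage-constrained selection follows. It assumes each candidate set of lengths already comes with its joint error probabilities from (6); the function names, the candidate numbers and the thresholds in the usage example are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# A candidate is (lengths I_1..I_m, joint error probabilities for j = 1..m).
Candidate = Tuple[Sequence[int], Sequence[float]]


def outage_metrics(k: int, lengths: Sequence[int],
                   joint_err: Sequence[float]) -> Tuple[float, float, float]:
    """Latency and transmissions without restarts, plus throughput."""
    latency = lengths[0] + sum(l * p for l, p in zip(lengths[1:], joint_err))
    transmissions = 1.0 + sum(joint_err[:-1])
    # Only the fraction of messages that avoid outage delivers k bits.
    throughput = k * (1.0 - joint_err[-1]) / latency
    return latency, transmissions, throughput


def pick_with_outage(k: int, candidates: List[Candidate],
                     p_outage: float) -> Optional[Tuple[Sequence[int], float]]:
    """Modified step 12: best throughput among candidates whose final joint
    error probability is below p_outage; None if no candidate qualifies."""
    best = None
    for lengths, joint_err in candidates:
        if joint_err[-1] >= p_outage:
            continue  # violates the outage constraint
        rt = outage_metrics(k, lengths, joint_err)[2]
        if best is None or rt > best[1]:
            best = (lengths, rt)
    return best


if __name__ == "__main__":
    # Hypothetical candidates, chosen only to exercise the constraint.
    cands = [([10, 5], [0.5, 0.1]), ([12, 5], [0.4, 0.01])]
    print(pick_with_outage(6, cands, p_outage=0.5))
    print(pick_with_outage(6, cands, p_outage=0.05))
```

Tightening the threshold can force the choice of a longer, lower-throughput set of lengths, which is the trade-off the constraint is meant to expose.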

VI. Conclusion 

The purpose of this paper is to further the conversation 
that will bring the information theory of feedback and the 
communication practice of feedback closer together. Beginning
with the idealized notion of rate-compatible codes with
decoding spheres that completely fill the available volume, 
the paper eventually demonstrates a convolutional code with 
performance strikingly similar to the ideal rate-compatible 
sphere-packing (RCSP) codes. 



An algorithm is presented that uses RCSP analysis to 
find the highest throughput that can be achieved for a fixed 
information length k and a fixed number of retransmissions. 
This algorithm also provides the transmission lengths of the 
initial and subsequent transmissions and the sequence of 
decoding error probabilities or "decoding error trajectory" that 
characterizes the throughput-maximizing performance. RCSP 
predictions and simulation results agree in demonstrating that 
feedback permits 90% of capacity to be achieved with about 
100 transmitted symbols assuming that the decoder knows 
when it has decoded correctly. However, the implications 
of this assumption for short block lengths warrant further 
investigation and the development of analysis that does not 
hinge on this assumption. 